High efficiency compression for object detection
Image and video compression has traditionally been tailored to human vision.
However, modern applications such as visual analytics and surveillance rely on
computers seeing and analyzing the images before (or instead of) humans. For
these applications, it is important to adjust compression to computer vision.
In this paper we present a bit allocation and rate control strategy that is
tailored to object detection. Using the initial convolutional layers of a
state-of-the-art object detector, we create an importance map that can guide
bit allocation to areas that are important for object detection. The proposed
method enables bit rate savings of 7% or more compared to default HEVC, at the
equivalent object detection rate.
Comment: The paper is published in IEEE ICASSP 18
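The idea of steering bits with an importance map can be illustrated with a minimal sketch. The abstract does not say how the map is turned into encoder decisions, so everything below is an assumption made for illustration: the function names `qp_offsets` and `block_qps`, the linear mapping, and the `max_offset` parameter are hypothetical. The only fact relied on is that HEVC quantization parameters (QP) for 8-bit video lie in the range 0 to 51, with lower QP meaning finer quantization and more bits.

```python
def qp_offsets(importance, max_offset=6):
    """Map per-block importance values to integer QP offsets.

    Hypothetical linear rule (not the paper's method): the most important
    block gets -max_offset (more bits), the least important gets +max_offset.
    """
    lo = min(min(row) for row in importance)
    hi = max(max(row) for row in importance)
    span = hi - lo
    offsets = []
    for row in importance:
        out = []
        for value in row:
            # Normalise to [0, 1]; a flat map is treated as "neutral" (0.5).
            norm = (value - lo) / span if span > 0 else 0.5
            out.append(round(max_offset * (1 - 2 * norm)))
        offsets.append(out)
    return offsets


def block_qps(base_qp, offsets):
    """Apply offsets to a base QP and clamp to the 8-bit HEVC range 0..51."""
    return [[max(0, min(51, base_qp + off)) for off in row] for row in offsets]


if __name__ == "__main__":
    importance_map = [[0.0, 1.0], [0.5, 0.5]]  # made-up 2x2 block grid
    print(block_qps(32, qp_offsets(importance_map)))
```

A real rate control scheme would also have to keep the total bit budget on target; this sketch only shows the direction of the adjustment.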
Can you tell a face from a HEVC bitstream?
Image and video analytics are being increasingly used on a massive scale. Not
only is the amount of data growing, but the complexity of the data processing
pipelines is also increasing, thereby exacerbating the problem. It is becoming
increasingly important to save computational resources wherever possible. We
focus on one of the poster problems of visual analytics -- face detection --
and approach the issue of reducing the computation by asking: Is it possible to
detect a face without full image reconstruction from the High Efficiency Video
Coding (HEVC) bitstream? We demonstrate that this is indeed possible, with
accuracy comparable to conventional face detection, by training a Convolutional
Neural Network on the output of the HEVC entropy decoder.
Scalable Video Coding for Humans and Machines
Video content is watched not only by humans, but increasingly also by
machines. For example, machine learning models analyze surveillance video for
security and traffic monitoring, search through YouTube videos for
inappropriate content, and so on. In this paper, we propose a scalable video
coding framework that supports machine vision (specifically, object detection)
through its base layer bitstream and human vision via its enhancement layer
bitstream. The proposed framework includes components from both conventional
and Deep Neural Network (DNN)-based video coding. The results show that on
object detection, the proposed framework achieves 13-19% bit savings compared
to state-of-the-art video codecs, while remaining competitive in terms of
MS-SSIM on the human vision task.
Comment: 6 pages, 5 figures, IEEE MMSP 202
The Warped Plane of the Classical Kuiper Belt
By numerically integrating the orbits of the giant planets and of test
particles over a period of four billion years, we follow the evolution of the
location of the midplane of the Kuiper belt. The Classical Kuiper belt conforms
to a warped sheet that precesses with a 1.9 Myr period. The present-day
location of the Kuiper belt plane can be computed using linear secular
perturbation theory: the local normal to the plane is given by the theory's
forced inclination vector, which is specific to every semimajor axis. The
Kuiper belt plane does not coincide with the invariable plane, but deviates
from it by up to a few degrees in stable zones. For example, at a semimajor
axis of 38 AU, the local Kuiper belt plane has an inclination of 1.9 deg and a
longitude of ascending node of 149.9 deg when referred to the mean ecliptic and
equinox of J2000. At a semimajor axis of 43 AU, the local plane has an
inclination of 1.9 deg and a nodal longitude of 78.3 deg. Only at infinite
semimajor axis does the Kuiper belt plane merge with the invariable plane,
whose inclination is 1.6 deg and nodal longitude is 107.7 deg. A Kuiper belt
object keeps its inclination relative to the Kuiper belt plane nearly constant,
even while the latter plane departs from the trajectory predicted by linear
theory. The constancy of relative inclination reflects the undamped amplitude
of free oscillation. Current observations of Classical Kuiper belt objects are
consistent with the plane being warped by the giant planets alone, but the
sample size will need to increase by a few times before confirmation exceeds
3-sigma in confidence. In principle, differences between the theoretically
expected plane and the observed plane could be used to infer as yet unseen
masses orbiting the Sun, but carrying out such a program would be challenging.
Comment: Astronomical Journal, in press. This version contains more details in
the abstract and minor proof correction
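The deviation of the local Kuiper belt plane from the invariable plane can be checked from the angles quoted in the abstract. The sketch below is not taken from the paper; it only uses the standard relation that a plane with inclination i and longitude of ascending node Ω has unit normal (sin i sin Ω, −sin i cos Ω, cos i) in the reference frame, and that the angle between two planes is the angle between their normals. The function names are hypothetical.

```python
import math


def plane_normal(inc_deg, node_deg):
    """Unit normal of a plane given inclination and nodal longitude (degrees)."""
    inc = math.radians(inc_deg)
    node = math.radians(node_deg)
    return (
        math.sin(inc) * math.sin(node),
        -math.sin(inc) * math.cos(node),
        math.cos(inc),
    )


def angle_between_planes(plane_a, plane_b):
    """Angle in degrees between two planes, each given as (inc_deg, node_deg)."""
    na = plane_normal(*plane_a)
    nb = plane_normal(*plane_b)
    dot = sum(x * y for x, y in zip(na, nb))
    dot = max(-1.0, min(1.0, dot))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(dot))


# Values quoted in the abstract (mean ecliptic and equinox of J2000).
INVARIABLE = (1.6, 107.7)
KB_38_AU = (1.9, 149.9)
KB_43_AU = (1.9, 78.3)

if __name__ == "__main__":
    print("38 AU:", angle_between_planes(KB_38_AU, INVARIABLE), "deg")
    print("43 AU:", angle_between_planes(KB_43_AU, INVARIABLE), "deg")
```

With these inputs both separations come out at roughly one degree, in line with the abstract's statement that the deviation is at most a few degrees.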
DFTS: Deep Feature Transmission Simulator
Collaborative intelligence is a deployment paradigm for deep AI models where some of the layers run on the mobile terminal or network edge, while others run in the cloud. In this scenario, features computed in the model need to be transferred between the edge and the cloud over an imperfect channel. Here we present a simulator to help study the effects of imperfect packet-based transmission of deep features. Our simulator is implemented in Keras and allows users to study the effects of both lossy packet transmission and quantization on the accuracy
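The two impairments named in this abstract, quantization and packet loss, can be sketched on a flat list of feature values. This is not the DFTS API and does not use Keras; the function names, the uniform min-max quantizer, the fixed packet size, and the choice to replace lost packets with zeros are all assumptions made for illustration.

```python
import random


def quantize(features, bits):
    """Uniform min-max quantization of feature values to 2**bits levels."""
    lo, hi = min(features), max(features)
    if hi == lo:
        return list(features)  # constant input: nothing to quantize
    step = (hi - lo) / ((1 << bits) - 1)
    return [lo + round((v - lo) / step) * step for v in features]


def drop_packets(features, packet_size, loss_prob, rng):
    """Split features into packets and zero out each packet with prob. loss_prob.

    Zero-filling lost packets is an assumed concealment strategy.
    """
    out = list(features)
    for start in range(0, len(out), packet_size):
        if rng.random() < loss_prob:
            for k in range(start, min(start + packet_size, len(out))):
                out[k] = 0.0
    return out


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for repeatable experiments
    feats = [rng.uniform(-1.0, 1.0) for _ in range(16)]
    received = drop_packets(quantize(feats, bits=4), packet_size=4,
                            loss_prob=0.25, rng=rng)
    print(received)
```

In a full experiment the `received` values would be fed to the cloud-side layers of the model, and accuracy would be compared against the unimpaired features.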
A Dataset of Labelled Objects on Raw Video Sequences
We present an object-labelled dataset called SFU-HW-Objects-v1, which contains object labels for a set of raw video sequences. The dataset can be useful in cases where both object detection accuracy and video coding efficiency need to be evaluated on the same dataset. Object ground truths for 18 of the High Efficiency Video Coding (HEVC) v1 Common Test Conditions (CTC) sequences have been labelled. The object categories used for the labeling are based on the Common Objects in Context (COCO) labels. A total of 21 object classes are found in the test sequences, out of the 80 original COCO label classes. Brief descriptions of the labeling process and the structure of the dataset are presented.